# Low character error rate
Phi 4 Multimodal Instruct Ko Asr
A Korean automatic speech recognition (ASR) and automatic speech translation (AST) model fine-tuned from microsoft/Phi-4-multimodal-instruct, with strong reported performance on the zeroth-korean and fleurs datasets.
Text-to-Audio
Transformers Korean

junnei
354
3
Whisper Large V3 Cantonese
Apache-2.0
A Cantonese automatic speech recognition model fine-tuned from Whisper Large V3 on the Common Voice 17 dataset.
Speech Recognition
Transformers Other

khleeloo
25
4
Court Records Htr
MIT
A handwriting recognition model fine-tuned from Microsoft's TrOCR, specialized for 19th-century Finnish and Swedish court record documents
Text Recognition
Kansallisarkisto
24
0
Hubert Uk
A Ukrainian automatic speech recognition model built on the mHuBERT-147 base model for Ukrainian speech-to-text tasks.
Speech Recognition Other
Yehor
31
4
Trocr Base Printed Captcha Ocr
A captcha OCR model fine-tuned from microsoft/trocr-base-printed, designed to extract text from image captchas.
Text Recognition
Transformers English

DunnBC22
272
8
Whisper Large V2 Mn 13
Apache-2.0
A Mongolian automatic speech recognition model fine-tuned from OpenAI's whisper-large-v2 on Mongolian datasets.
Speech Recognition
Transformers Other

bayartsogt
161
6
Whisper Large V2 Cantonese
Apache-2.0
A Cantonese automatic speech recognition (ASR) model fine-tuned from OpenAI Whisper Large V2 on the Common Voice 11.0 Cantonese dataset, with a reported character error rate (CER) of 6.21%.
Speech Recognition
Transformers Other

Scrya
210
7
Whisper Small Chinese Base
Apache-2.0
A Chinese speech recognition model fine-tuned from openai/whisper-small on the google/fleurs cmn_hans_cn dataset.
Speech Recognition
Transformers

Jingmiao
117
23
Whisper Large V2 Cantonese
Apache-2.0
An automatic speech recognition model fine-tuned from OpenAI Whisper Large V2 on a Cantonese dataset, achieving a character error rate of 6.7274% on the test set.
Speech Recognition
Transformers Other

simonl0909
131
12
Wav2vec2 Bloom Speech Tgl
Other
A Tagalog speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m.
Speech Recognition
Transformers Other

sil-ai
3,412
0
Wav2vec2 Large Xlsr 53 Cantonese
Apache-2.0
A Cantonese speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on version 8.0 of the Common Voice corpus.
Speech Recognition
Transformers Other

CAiRE
1,214
3
Wav2vec2 Xls R 1b Italian Doc4lm 5gram
Apache-2.0
An Italian speech recognition model fine-tuned from the 1B-parameter XLS-R model; supports decoding with a language model.
Speech Recognition
Transformers Other

radiogroup-crits
19
1
Wav2vec2 Xlsr 300m Finnish Lm
Apache-2.0
A Finnish automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on 275.6 hours of annotated Finnish data; it can be used with a KenLM language model.
Speech Recognition
Transformers Other

Finnish-NLP
28.39k
0
Wav2vec2 Xls R 1b Italian Robust
Apache-2.0
An Italian automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-1b on the Common Voice 7 and Libri Speech datasets.
Speech Recognition
Transformers Other

dbdmg
130
0
Wav2vec2 10july
Apache-2.0
A German automatic speech recognition model based on the XLSR Wav2Vec2 architecture, fine-tuned on the Common Voice German dataset.
Speech Recognition
Transformers German

sourabharsh
24
0
Wav2vec2 Large Xlsr 53 Polish
Apache-2.0
An XLSR-53 large model for Polish automatic speech recognition, fine-tuned from facebook/wav2vec2-large-xlsr-53.
Speech Recognition Other
jonatasgrosman
412.13k
11
Wav2vec2 Large Xls R 300m Ru
A Russian automatic speech recognition model based on the Wav2Vec2 XLS-R architecture with 300M parameters, evaluated on public speech and robust speech event datasets.
Speech Recognition
Transformers Other

mobedkova
37
1
Wav2vec2 Xls R 300m Es
Apache-2.0
A Spanish automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the COMMON_VOICE - ES dataset.
Speech Recognition
Transformers Spanish

samitizerxu
23
0
Wav2vec2 Large Xlsr 53 Finnish
Apache-2.0
A Finnish automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53, supporting audio input sampled at 16 kHz.
Speech Recognition
Transformers Other

vasilis
27
0
Wav2vec2 Large Xlsr 53 Estonian
Apache-2.0
An Estonian automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice dataset.
Speech Recognition
Transformers Other

vasilis
26
0
Wav2vec2 Large Xlsr 53 Russian
Apache-2.0
A Russian speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53, supporting audio input sampled at 16 kHz.
Speech Recognition Other
jonatasgrosman
3.9M
54
Wav2vec2 Large Xlsr 53 Hungarian
Apache-2.0
An XLSR-53 large model fine-tuned for Hungarian speech recognition on the Common Voice and CSS10 datasets.
Speech Recognition Other
jonatasgrosman
127.73k
9
Wav2vec2 Large Xlsr 53 Greek
Apache-2.0
A Greek speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53, supporting audio input sampled at 16 kHz.
Speech Recognition
Transformers Other

vasilis
25
0
Wav2vec2 Large Xlsr 53 Persian
Apache-2.0
An XLSR-53 large model for Persian speech recognition, fine-tuned from facebook/wav2vec2-large-xlsr-53.
Speech Recognition Other
jonatasgrosman
257.76k
22
Wav2vec2 Xls R 300m Hy
Apache-2.0
An Armenian automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-xls-r-300m on Armenian-language datasets.
Speech Recognition
Transformers Other

arampacha
25
0
Wav2vec2 Large Xls R 1b Indonesian
Apache-2.0
An automatic speech recognition model fine-tuned from facebook/wav2vec2-xls-r-1b on the Common Voice Indonesian dataset.
Speech Recognition
Transformers Other

kingabzpro
14
1
Wav2vec2 Xlsr 1b Finnish
Apache-2.0
A fine-tuned version of Facebook's wav2vec2-xls-r-1b model for Finnish automatic speech recognition (ASR), trained with 259.57 hours of annotated Finnish speech data
Speech Recognition
Transformers Other

aapot
18
0
Xlsr 300m CV 8.0 50 EP New Params Nl
Apache-2.0
An automatic speech recognition (ASR) model based on the XLS-R architecture with 300M parameters, optimized for Dutch and trained on the Common Voice 8.0 dataset.
Speech Recognition
Transformers Other

Iskaj
25
0
Xlsr300m Cv 7.0 Nl Lm
Apache-2.0
An XLS-R 300M automatic speech recognition (ASR) model optimized for Dutch, trained on the Common Voice 8 Dutch dataset.
Speech Recognition
Transformers Other

Iskaj
23
0
Wav2vec2 Large Xls R 300m Bg V1
Apache-2.0
An automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-xls-r-300m on Bulgarian speech datasets.
Speech Recognition
Transformers Other

DrishtiSharma
16
1